Weighted distance measures for metabolomic data

Authors

  • Philip M. Dixon
  • Lankun Wu
  • Mark P. Widrlechner
  • Eve Syrkin Wurtele
Abstract

Motivation: Many analyses of metabolomic data depend on the choice of distance measure, but it is unclear how to make an appropriate choice. The choice is especially unclear when the variance in metabolite abundance is not constant.

Methods: We describe a class of weighted distance measures that account for non-constant variance in metabolite abundance by using a model of the relationship between the variance and the mean. We develop two methods to assess the performance of a distance measure. One method measures repeatability across bootstrap collections of metabolites; the second assesses agreement with patterns expected to be found in the data. These methods were used to compare seven distance measures by evaluating data on 58 metabolites measured in 40 accessions of Echinacea, a genus of plants widely used as botanical supplements.

Results: A new precision-weighted Manhattan distance and the Canberra distance are the most repeatable and the most in agreement with the expected pattern. Distances based on standardized data are intermediate in performance, and unweighted Manhattan or Euclidean distance measures are the least repeatable and the least in
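As a rough illustration of the kinds of measures the abstract compares, the sketch below implements the unweighted Manhattan distance, the standard Canberra distance, and one possible precision-weighted Manhattan distance. The abstract does not give the authors' formula or their variance model, so the weighting here (dividing each absolute difference by an assumed per-metabolite standard deviation) and the helper `constant_cv_sd` (standard deviation proportional to the mean) are hypothetical choices made only for this example.

```python
import numpy as np


def manhattan(x, y):
    """Unweighted Manhattan distance: sum of absolute differences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y)))


def canberra(x, y):
    """Canberra distance: sum of |x_i - y_i| / (|x_i| + |y_i|)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    # Terms where both values are zero contribute 0 by convention.
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(np.sum(terms))


def constant_cv_sd(x, y, cv):
    """Hypothetical variance model (an assumption, not from the source):
    the standard deviation of each metabolite is cv times the mean of
    the two abundances being compared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return cv * (x + y) / 2.0


def precision_weighted_manhattan(x, y, sd):
    """Illustrative weighted Manhattan distance: each absolute difference
    is divided by an assumed standard deviation, so noisier metabolites
    count for less. This is a sketch, not the authors' definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sd = np.asarray(sd, dtype=float)
    return float(np.sum(np.abs(x - y) / sd))


# Made-up abundances for three metabolites in two accessions.
a = [1.0, 10.0, 100.0]
b = [2.0, 20.0, 200.0]

d_manhattan = manhattan(a, b)  # dominated by the most abundant metabolite
d_canberra = canberra(a, b)    # each metabolite contributes equally here
d_weighted = precision_weighted_manhattan(a, b, constant_cv_sd(a, b, cv=0.5))
```

Under the assumed constant-CV model and with positive abundances, the weighted distance in this sketch equals 2/cv times the Canberra distance, whereas the unweighted Manhattan distance is driven almost entirely by the most abundant metabolite. A different variance model would give different weights.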

Similar resources

Rapid execution of weighted edit distances

The comparison of large numbers of strings plays a central role in ontology matching, record linkage and link discovery. While several standard string distance and similarity measures have been developed with these explicit goals in mind, similarities and distances learned out of the data have been shown to often perform better with respect to the F-measure that they can achieve. Still, the pra...

Decision Making with Distance Measures, Weighted Averages and Induced Owa Operators

We develop a new decision making model by using distance measures, weighted averages and OWA operators. We introduce the induced ordered weighted averaging – weighted averaging distance (IOWAWAD) operator. We study some of its main properties and particular cases such as the weighted Hamming distance, the induced OWA distance (IOWAD), the arithmetic weighted distance and the arithmetic IOWAD op...

Applying weighted network measures to distance matrices

Many approaches to the analysis of weighted networks are not designed for fully connected weighted networks. However, as any distance matrix between objects is a fully connected weighted network, such networks are extremely common. In earlier work we derived an approach for the analysis of weighted networks which also works on fully connec...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...


Publication date: 2009